A measure of association between vectors based on 'similarity covariance'

Abstract

The "maximum similarity correlation" definition introduced in this study is motivated by the seminal work of Székely et al. on "distance covariance" (Ann. Statist. 2007, 35: 2769-2794; Ann. Appl. Stat. 2009, 3: 1236-1265). Instead of using Euclidean distances "d" as in Székely et al., we use "similarity", which can be defined as "exp(-d/s)", where the scale parameter s>0 controls how rapidly the similarity falls off with distance. Scale parameters are chosen by maximizing the similarity correlation. The motivation for using "similarity" originates in spectral clustering theory (see e.g. Ng et al. 2001, Advances in Neural Information Processing Systems 14: 849-856). We show that a particular form of similarity correlation is asymptotically equivalent to distance correlation for large values of the scale parameter. Furthermore, we extend similarity correlation to coherence between complex-valued vectors, including its partitioning into real and imaginary contributions. Several toy examples are used to compare distance and similarity correlations. For instance, points on a noiseless straight line give distance and similarity correlation values equal to 1, but points on a noiseless circle produce a near-zero distance correlation (dCorr=0.02) while the similarity correlation is distinctly nonzero (sCorr=0.36). In contrast to the distance approach, similarity gives more importance to small distances, which emphasizes the local properties of functional relations. This paper represents a preliminary empirical study, showing that the novel similarity association has some distinct practical advantages over distance-based association. For the sake of reproducible research, the software code implementing all methods here (written in Lazarus Free Pascal, "www.lazarus.freepascal.org"), together with all test data, is freely available at: "sites.google.com/site/pascualmarqui/home/similaritycovariance".
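
To make the construction concrete, the following Python sketch computes the sample distance correlation of Székely et al. and a similarity analogue in which each distance matrix is replaced by exp(-d/s) before double centering. It is an illustration under stated assumptions, not the authors' Lazarus Free Pascal implementation: the function names, the square-root normalization of the similarity version, and the use of fixed, user-supplied scales `s_x` and `s_y` are all choices made here. The abstract states that scales are chosen by maximizing the similarity correlation but does not describe the procedure, so no scale search is attempted.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance matrix between n observations (rows of x)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n scalar observations
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def double_center(m):
    """Subtract row and column means, then add back the grand mean."""
    return (m - m.mean(axis=0, keepdims=True)
              - m.mean(axis=1, keepdims=True) + m.mean())

def centered_correlation(a, b):
    """sqrt( mean(A*B) / sqrt(mean(A*A) * mean(B*B)) ) for double-centered A, B."""
    A, B = double_center(a), double_center(b)
    cov2 = max((A * B).mean(), 0.0)  # clip tiny negative rounding errors
    norm = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if norm == 0 else float(np.sqrt(cov2 / norm))

def distance_correlation(x, y):
    """Sample distance correlation (Szekely et al.)."""
    return centered_correlation(pairwise_distances(x), pairwise_distances(y))

def similarity_correlation(x, y, s_x, s_y):
    """Assumed analogue: same formula applied to similarities exp(-d/s).

    s_x and s_y are fixed scales supplied by the caller; the paper's
    scale-maximization step is NOT reproduced here.
    """
    sim_x = np.exp(-pairwise_distances(x) / s_x)
    sim_y = np.exp(-pairwise_distances(y) / s_y)
    return centered_correlation(sim_x, sim_y)

# Toy example: points on a noiseless straight line y = 2x + 1.
t = np.linspace(0.0, 1.0, 50)
x_line, y_line = t, 2.0 * t + 1.0
print(distance_correlation(x_line, y_line))
print(similarity_correlation(x_line, y_line, s_x=0.5, s_y=1.0))
```

For the straight line, the distances in y are exactly twice those in x, so with `s_y = 2 * s_x` the two similarity matrices coincide and both coefficients equal 1 up to rounding, in line with the abstract. For large scales, exp(-d/s) ≈ 1 - d/s, so the centered similarity matrices in this sketch become proportional to the centered distance matrices and the similarity coefficient approaches the distance correlation. The values dCorr=0.02 and sCorr=0.36 quoted for the circle depend on the paper's own data and scale selection and are not expected to be reproduced by this sketch.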
